Results 1 - 20 of 189
1.
Adv Protein Chem Struct Biol ; 139: 383-403, 2024.
Article in English | MEDLINE | ID: mdl-38448141

ABSTRACT

Mucormycosis is an uncommon opportunistic fungal infection caused by a class of molds called mucoromycetes. Currently, antifungal therapy and surgical debridement are the primary treatment options for mucormycosis. Despite the importance of comprehensive knowledge on mucormycosis, there is a lack of well-annotated databases that provide all relevant information. In this study, we have gathered and organized all available information related to mucormycosis, including disease-associated genomes, proteins, and diagnostic methods. Furthermore, using the AlphaFold2.0 prediction tool, we have predicted the tertiary structures of potential drug targets. We have categorized the information into three major sections: "genomics/proteomics," "immunotherapy," and "drugs." The genomics/proteomics module contains information on different strains responsible for mucormycosis. The immunotherapy module includes putative sequence-based therapeutics predicted using established tools. The drugs module provides information on available drugs for treating the disease. It also offers prerequisite information for computer-aided drug design, such as putative targets and predicted structures. To provide comprehensive information over the internet, we developed a web-based platform, MucormyDB (https://webs.iiitd.edu.in/raghava/mucormydb/).


Subjects
Anti-HIV Agents; Mucormycosis; Humans; Mucormycosis/drug therapy; Mucormycosis/genetics; Genomics; Databases, Factual; Drug Delivery Systems
2.
Antibiotics (Basel) ; 13(2), 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38391554

ABSTRACT

Most existing methods for predicting antibacterial peptides (ABPs) are designed to target either gram-positive or gram-negative bacteria. In this study, we describe a method that allows us to predict ABPs against gram-positive, gram-negative, and gram-variable bacteria. Firstly, we developed an alignment-based approach using BLAST to identify ABPs, which achieved poor sensitivity. Secondly, we employed a motif-based approach to predict ABPs and obtained high precision with low sensitivity. To address the issue of poor sensitivity, we developed alignment-free methods for predicting ABPs using machine/deep learning techniques. For the alignment-free methods, we utilized a wide range of peptide features, including different types of composition, binary profiles of terminal residues, and fastText word embedding. Five-fold cross-validation was used to build machine/deep learning models on training datasets. These models were evaluated on an independent dataset with no peptide in common between the training and independent datasets. Our machine learning-based model developed using the amino acid binary profile of terminal residues achieved maximum AUCs of 0.93, 0.98, and 0.94 for gram-positive, gram-negative, and gram-variable bacteria, respectively, on an independent dataset. Our method outperforms existing methods on an independent dataset. A user-friendly web server, standalone package and pip package have been developed to facilitate peptide-based therapeutics.
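
As an illustration of the "binary profile of terminal residues" feature mentioned in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes a one-hot encoding over the 20 standard amino acids and a window of `n` residues at each terminus; the function names and the default `n = 5` are hypothetical.

```python
# Minimal sketch (assumption): one-hot "binary profile" of the N- and
# C-terminal residues of a peptide. Names and window size are illustrative.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_profile(segment: str) -> list[int]:
    """Encode each residue as a 20-dimensional 0/1 vector and concatenate."""
    profile: list[int] = []
    for residue in segment.upper():
        # Non-standard residues (e.g. 'X') become an all-zero vector.
        profile.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return profile


def terminal_binary_profile(peptide: str, n: int = 5) -> list[int]:
    """Concatenate the binary profiles of the first n and last n residues."""
    if len(peptide) < n:
        raise ValueError("peptide is shorter than the terminal window")
    return binary_profile(peptide[:n]) + binary_profile(peptide[-n:])
```

With `n = 5`, every peptide maps to a fixed-length vector of 2 x 5 x 20 = 200 values, which is what lets peptides of different lengths share one feature space.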

3.
Front Bioinform ; 4: 1341479, 2024.
Article in English | MEDLINE | ID: mdl-38379813

ABSTRACT

In the past, several methods have been developed for predicting the single-label subcellular localization of messenger RNA (mRNA). However, only a few methods are designed to predict the multi-label subcellular localization of mRNA. Furthermore, the existing methods are slow and cannot be implemented at a transcriptome scale. In this study, a fast and reliable method has been developed for predicting the multi-label subcellular localization of mRNA that can be implemented at a genome scale. Machine learning-based methods have been developed using mRNA sequence composition, where the XGBoost-based classifier achieved an average area under the receiver operating characteristic curve (AUROC) of 0.709 (0.668-0.732). In addition to alignment-free methods, we developed alignment-based methods using motif search techniques. Finally, a hybrid technique that combines the XGBoost model and the motif-based approach has been developed, achieving an average AUROC of 0.742 (0.708-0.816). Our method, MRSLpred, outperforms the existing state-of-the-art classifier in terms of performance and computational efficiency. A publicly accessible webserver and a standalone tool have been developed to assist researchers (webserver: https://webs.iiitd.edu.in/raghava/mrslpred/).
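
The "mRNA sequence composition" features referred to in this abstract can be illustrated with a simple k-mer frequency calculation. This is a hedged sketch under the assumption that composition means normalized k-mer counts; the function name `kmer_composition` and the choice of `k` are illustrative and not taken from MRSLpred.

```python
# Minimal sketch (assumption): normalized k-mer composition of a nucleotide
# sequence, a common alignment-free feature representation.
from itertools import product


def kmer_composition(seq: str, k: int = 2) -> dict[str, float]:
    """Return the fraction of each possible k-mer over the alphabet ACGT."""
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA letters alike
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}
```

The resulting dictionary has 4^k entries per sequence and could be passed, as a fixed-length vector, to any classifier.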

4.
Comput Biol Med ; 170: 108083, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38295479

ABSTRACT

B cells are an essential component of the immune system and play a vital role in the immune response against pathogenic infection by producing antibodies. Existing methods predict either linear or conformational B-cell epitopes in an antigen. In this study, a single method was developed for predicting both types (linear/conformational) of B-cell epitopes. The dataset used in this study contains 3875 B-cell epitopes and 3996 non-B-cell epitopes, where the B-cell epitopes include both linear and conformational epitopes. Our primary analysis indicates that certain residues (such as Asp, Glu, Lys, and Asn) are more prominent in B-cell epitopes. We developed machine learning-based methods using different types of sequence composition and achieved the highest AUROC of 0.80 using dipeptide composition. In addition, models were developed on selected features, but no further improvement was observed. Our similarity-based method implemented using BLAST shows a high probability of correct prediction but poor sensitivity. Finally, we developed a hybrid model that combines alignment-free (dipeptide-based random forest) and alignment-based (BLAST-based similarity) models. Our hybrid model attained a maximum AUROC of 0.83 with an MCC of 0.49 on the independent dataset, and it performs better than existing methods on the independent dataset used in this study. All models were trained and tested on 80% of the data using a cross-validation technique, and the final model was evaluated on the remaining 20% of the data, called the independent or validation dataset. A webserver and standalone package named "CLBTope" have been developed for predicting, designing, and scanning B-cell epitopes in an antigen sequence (https://webs.iiitd.edu.in/raghava/clbtope/).
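
The evaluation protocol described in this abstract (cross-validation on 80% of the data, final evaluation on a held-out 20%) can be sketched with the standard library alone. The helper names, the fixed seed, and the choice of five folds are assumptions for illustration; the abstract does not state the number of folds.

```python
# Minimal sketch (assumption): hold out 20% as an independent set, then
# generate k cross-validation folds from the remaining 80%.
import random


def split_train_independent(items, independent_fraction=0.2, seed=0):
    """Shuffle once and return (training_items, independent_items)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_independent = int(round(len(shuffled) * independent_fraction))
    return shuffled[n_independent:], shuffled[:n_independent]


def k_folds(items, k=5):
    """Yield (training, validation) pairs; each item is validated exactly once."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, validation
```

Keeping the independent set out of every fold is the point of the design: model selection only ever sees the 80% portion.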


Subjects
Antigens; Epitopes, B-Lymphocyte; Epitopes, B-Lymphocyte/chemistry; Amino Acid Sequence; Antigens/chemistry; Molecular Conformation; Dipeptides
5.
Proteomics ; 24(6): e2300231, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37525341

ABSTRACT

Non-invasive diagnostics and therapies are crucial to prevent patients from undergoing painful procedures. Exosomal proteins can serve as important biomarkers for such advancements. In this study, we attempted to build a model to predict exosomal proteins. All models were trained, tested, and evaluated on a non-redundant dataset comprising 2831 exosomal and 2831 non-exosomal proteins, where no two proteins have more than 40% similarity. Initially, the standard similarity-based method Basic Local Alignment Search Tool (BLAST) was used to predict exosomal proteins, which failed due to the low level of similarity in the dataset. To overcome this challenge, machine learning (ML) based models were developed using compositional and evolutionary features of proteins, achieving an area under the receiver operating characteristic curve (AUROC) of 0.73. Our analysis also indicated that exosomal proteins have a variety of sequence-based motifs that can be used to predict exosomal proteins. Hence, we developed a hybrid method combining motif-based and ML-based approaches for predicting exosomal proteins, achieving a maximum AUROC of 0.85 and MCC of 0.56 on an independent dataset. This hybrid model performs better than presently available methods when assessed on an independent dataset. A web server and standalone software, ExoProPred (https://webs.iiitd.edu.in/raghava/exopropred/), have been created to help scientists predict and discover exosomal proteins and find functional motifs present in them.
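
The abstract does not say how the motif-based and ML-based predictions are combined. One simple possibility, shown here purely as an assumption, is to add a fixed bonus to the ML probability whenever a known motif occurs in the sequence. The function name, the bonus of 0.5, and the example motif are all hypothetical.

```python
# Minimal sketch (assumption): combine an ML probability with a motif hit by
# adding a fixed bonus. This is NOT stated in the abstract; it is one
# plausible way to build a "hybrid" score.
def hybrid_score(ml_probability: float, sequence: str,
                 motifs: list[str], motif_bonus: float = 0.5) -> float:
    """Return the ML probability, raised by motif_bonus if any motif matches."""
    has_motif = any(motif in sequence for motif in motifs)
    return ml_probability + (motif_bonus if has_motif else 0.0)


# Usage with a made-up sequence and a made-up motif:
score = hybrid_score(0.4, "MKTAYIAKQR", ["TAY"])  # motif present, bonus applied
```

Under this scheme a sequence with a motif hit is pushed above the decision threshold even when the ML model alone is uncertain, which mirrors the high-precision, low-coverage behaviour usually reported for motif searches.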


Subjects
Random Forest; Sequence Analysis, Protein; Humans; Amino Acid Sequence; Sequence Analysis, Protein/methods; Proteins/metabolism; Software
6.
Comput Biol Med ; 167: 107594, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37918263

ABSTRACT

Advancements in cancer immunotherapy have shown significant outcomes in treating cancers. To design effective immunotherapy, it is important to understand the immune response of a patient based on their genomic profile. However, such analyses require proficiency in bioinformatic methods. Rapidly evolving sequencing technologies and statistical methods create a barrier for scientists who want to find biomarkers for different cancers but lack detailed knowledge of coding or tools. Here, we provide a web-based resource that gives scientists with no bioinformatics expertise the ability to obtain prognostic biomarkers for different cancer types at different levels. We computed prognostic biomarkers from 8346 cancer patients for twenty cancer types. These biomarkers were computed based on i) the presence of 352 human leukocyte antigen class I (HLA-I) alleles, ii) 660,959 tumor-specific HLA-I neobinders, and iii) the expression profile of 153 cytokines. We observed that the survival risk of cancer patients depends on the presence of certain HLA-I alleles; for example, liver hepatocellular carcinoma patients with HLA-A*03:01 are at lower risk. Our analysis indicates that neobinders of HLA-I alleles correlate strongly with overall survival in certain cancer types. For example, HLA-B*07:02 binders have a correlation of 0.49 with survival of lung squamous cell carcinoma patients and -0.77 with survival of kidney chromophobe patients. Additionally, we computed prognostic biomarkers based on cytokine expression. Higher expression of a few cytokines is favorable for survival, such as IL-2 in bladder urothelial carcinoma, whereas IL-5R is unfavorable for survival in kidney chromophobe patients. Freely accessible to the public, CancerHLA-I maintains raw and analysed data (https://webs.iiitd.edu.in/raghava/cancerhla1/).


Subjects
Carcinoma, Transitional Cell; Lung Neoplasms; Urinary Bladder Neoplasms; Humans; Cytokines/genetics; Alleles; Carcinoma, Transitional Cell/genetics; Urinary Bladder Neoplasms/genetics; Biomarkers; Lung Neoplasms/genetics; Risk Assessment
7.
Protein Sci ; 32(11): e4785, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37733481

ABSTRACT

The identification of B-cell epitopes (BCEs) in antigens is a crucial step in developing recombinant vaccines or immunotherapies for various diseases. Over the past four decades, numerous in silico methods have been developed for predicting BCEs. However, existing reviews have only covered specific aspects, such as the progress in predicting conformational or linear BCEs. Therefore, in this paper, we have undertaken a systematic approach to provide a comprehensive review covering all aspects associated with the identification of BCEs. First, we have covered the experimental techniques developed over the years for identifying linear and conformational epitopes, including the limitations and challenges associated with these techniques. Second, we have briefly described the historical perspectives and resources that maintain experimentally validated information on BCEs. Third, we have extensively reviewed the computational methods developed for predicting conformational BCEs from the structure of the antigen, as well as the methods for predicting conformational epitopes from the sequence. Fourth, we have systematically reviewed the in silico methods developed in the last four decades for predicting linear or continuous BCEs. Finally, we have discussed the overall challenge of identifying continuous or conformational BCEs. In this review, we only listed major computational resources; a complete list with the URL is available from the BCinfo website (https://webs.iiitd.edu.in/raghava/bcinfo/).


Subjects
Antigens; Epitopes, B-Lymphocyte; Epitopes, B-Lymphocyte/chemistry; Amino Acid Sequence
8.
Comput Biol Med ; 160: 106929, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37126926

ABSTRACT

Tumor necrosis factor alpha (TNF-α) is a pleiotropic pro-inflammatory cytokine that is crucial in controlling the signaling pathways within immune cells. Recent studies reported that higher expression levels of TNF-α are associated with the progression of several diseases, including cancers, cytokine release syndrome in COVID-19, and autoimmune disorders. Thus, there is an urgent need to develop immunotherapies or subunit vaccines to manage TNF-α progression in various disease conditions. In this pilot study, we propose a host-specific in silico tool for predicting, designing, and scanning TNF-α inducing epitopes. The prediction models were trained and validated on experimentally validated TNF-α inducing/non-inducing epitopes from human and mouse hosts. Firstly, we developed alignment-free methods (machine learning-based models using composition-based features of peptides) for predicting TNF-α inducing peptides and achieved maximum AUROCs of 0.79 and 0.74 for human and mouse hosts, respectively. Secondly, an alignment-based method (using BLAST) was used for predicting TNF-α inducing epitopes. Finally, a hybrid method (a combination of the alignment-free and alignment-based methods) was developed for predicting epitopes. The hybrid approach achieved maximum AUROCs of 0.83 and 0.77 on an independent dataset for human and mouse hosts, respectively. We have also identified potential TNF-α inducing peptides in different proteins of HIV-1, HIV-2, SARS-CoV-2, and human insulin. The best models developed in this study have been incorporated in the webserver TNFepitope (https://webs.iiitd.edu.in/raghava/tnfepitope/), a standalone package and GitLab (https://gitlab.com/raghavalab/tnfepitope).


Subjects
COVID-19; Tumor Necrosis Factor-alpha; Humans; Animals; Mice; Epitopes; Pilot Projects; SARS-CoV-2; Peptides
9.
Methods Mol Biol ; 2673: 317-327, 2023.
Article in English | MEDLINE | ID: mdl-37258924

ABSTRACT

Interleukin 6 (IL6) is a major pro-inflammatory cytokine that plays a pivotal role in both innate and adaptive immune responses. In the past, a number of studies reported that high levels of IL6 promote the proliferation of cancer, autoimmune disorders, and cytokine storm in COVID-19 patients. Thus, it is extremely important to identify and remove the antigenic regions from a therapeutic protein or vaccine candidate that may induce IL6-associated immunotoxicity. To overcome this challenge, our group has developed a computational tool, IL6pred, for discovering IL6-inducing peptides in a vaccine candidate. The aim of this chapter is to describe the potential applications and methodology of IL6pred. It sheds light on the prediction, designing, and scanning modules of the IL6pred webserver and standalone package (https://webs.iiitd.edu.in/raghava/il6pred/).


Subjects
COVID-19; Vaccines; Humans; Interleukin-6/genetics; COVID-19/prevention & control; Cytokines/metabolism; Internet
10.
Methods Mol Biol ; 2673: 329-338, 2023.
Article in English | MEDLINE | ID: mdl-37258925

ABSTRACT

Interleukins are a distinctive class of molecules exhibiting various immune signaling functions. The immunoregulatory cytokine interleukin 13 (IL13) is primarily synthesized by activated T-helper 2 cells, mast cells, and basophils. IL13 is known to stimulate many allergic and autoimmune diseases, such as asthma, rheumatoid arthritis, systemic sclerosis, ulcerative colitis, airway hyperresponsiveness, glycoprotein hypersecretion, and goblet cell hyperplasia. In addition to such disorders, IL13 also leads to carcinogenesis by inhibiting tumor immunosurveillance. Due to its role in various diseases, predicting IL13-inducing peptides or regions in a protein is vital to designing safe protein vaccines and therapeutics. IL13pred is an in silico tool that aids in identifying, predicting, and designing IL13-inducing peptides. The IL13pred web server and standalone package are accessible online (https://webs.iiitd.edu.in/raghava/il13pred/).


Subjects
Asthma; Interleukin-13; Humans; Cytokines; Interleukins; Peptides
11.
Front Microbiol ; 14: 1148579, 2023.
Article in English | MEDLINE | ID: mdl-37032893

ABSTRACT

Phage therapy is a viable alternative to antibiotics for treating microbial infections, particularly for managing drug-resistant strains of bacteria. One of the major challenges in designing phage-based therapy is to identify the most appropriate phage candidate to treat a bacterial infection. In this study, an attempt has been made to predict phage-host interactions with high accuracy in order to identify the bacteriophage that can be used for treating a bacterial infection. The models were built using a training dataset containing 826 phage-host interactions and were evaluated on a validation dataset comprising 1,201 phage-host interactions. Firstly, alignment-based models were developed using similarity between phage-phage (BLASTPhage), host-host (BLASTHost) and phage-CRISPR (CRISPRPred), where we achieved accuracies of 42.4-66.2% for BLASTPhage, 55-78.4% for BLASTHost, and 43.7-80.2% for CRISPRPred across five taxonomic levels. Secondly, alignment-free models were developed using machine learning techniques. Thirdly, hybrid models were developed by integrating the alignment-free models and the similarity scores, where we achieved a maximum performance of 60.6-93.5%. Finally, an ensemble model was developed that combines the hybrid and alignment-based models. Our ensemble model achieved the highest accuracies of 67.9, 80.6, 85.5, 90, and 93.5% at the Genus, Family, Order, Class, and Phylum levels, respectively, on the validation dataset. To serve the scientific community, we have also developed a webserver named PhageTB and provided a standalone software package (https://webs.iiitd.edu.in/raghava/phagetb/).

12.
Comput Biol Med ; 158: 106864, 2023 May.
Article in English | MEDLINE | ID: mdl-37058758

ABSTRACT

Interleukin-5 (IL-5) is an attractive therapeutic target due to its pivotal role in several eosinophil-mediated diseases. The aim of this study is to develop a model for predicting IL-5 inducing antigenic regions in a protein with high precision. All models in this study have been trained, tested and validated on 1907 experimentally validated IL-5 inducing and 7759 non-IL-5 inducing peptides obtained from IEDB. Our primary analysis indicates that IL-5 inducing peptides are dominated by certain residues such as Ile, Asn, and Tyr. It was also observed that binders of a wide range of HLA alleles can induce IL-5. Initially, alignment-based methods were developed using similarity and motif search. These alignment-based methods provide high precision but poor coverage. To overcome this limitation, we explored alignment-free methods, which are mainly machine learning-based models. Firstly, models were developed using binary profiles, and an eXtreme Gradient Boosting-based model achieved a maximum AUC of 0.59. Secondly, composition-based models were developed, and our dipeptide-based random forest model achieved a maximum AUC of 0.74. Thirdly, a random forest model developed using 250 selected dipeptides achieved an AUC of 0.75 and an MCC of 0.29 on the validation dataset, the best among the alignment-free models. To improve the performance, we developed an ensemble or hybrid method that combines alignment-based and alignment-free methods. Our hybrid method achieved an AUC of 0.94 with an MCC of 0.60 on a validation/independent dataset. The best hybrid model developed in this study has been incorporated into a user-friendly web server and a standalone package named 'IL5pred' (https://webs.iiitd.edu.in/raghava/il5pred/).


Subjects
Interleukin-5; Peptides; Computer Simulation; Peptides/chemistry; Computers; Antigens; Databases, Protein
13.
Drug Discov Today ; 28(4): 103523, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36764575

ABSTRACT

Over the years, numerous vaccines have been developed against viral infections; however, a complete database that provides comprehensive information on viral vaccines has been lacking. In this review, along with our freely accessible database ViralVacDB, we provide details of the viral vaccines, their type, routes of administration and approving agencies. This repository systematically covers additional information such as disease name, adjuvant, manufacturer, clinical status, age and dosage against 422 viral vaccines, including 145 approved vaccines and 277 in clinical trials. We anticipate that this database will be highly beneficial to researchers and others working in pharmaceuticals and immuno-informatics.


Subjects
Viral Vaccines; Virus Diseases; Humans; Virus Diseases/prevention & control; Databases, Factual
14.
Front Immunol ; 14: 1056101, 2023.
Article in English | MEDLINE | ID: mdl-36742312

ABSTRACT

Introduction: Celiac disease (CD) is an autoimmune gastrointestinal disorder that causes immune-mediated enteropathy in response to gluten. Gluten immunogenic peptides can trigger immune responses that damage the small intestine. HLA-DQ2/DQ8 are the major alleles that bind to epitopes/antigenic regions of gluten and induce celiac disease. There is a need to identify CD associated epitopes in protein-based foods and therapeutics. Methods: In this study, computational tools have been developed to predict CD associated epitopes and motifs. The dataset used for training, testing and evaluation contains experimentally validated CD associated and non-CD associated peptides. We performed positional analysis to identify the most significant position of an amino acid residue in the peptide and checked the frequency of HLA alleles. We also computed amino acid composition to develop machine learning-based models. We also developed an ensemble method that combines the motif-based approach and machine learning-based models. Results and Discussion: Our analysis supports the existing hypothesis that proline (P) and glutamine (Q) are highly abundant in CD associated peptides. A model based on the density of P and Q in peptides has been developed for predicting CD associated peptides, which achieves a maximum AUROC of 0.98 on independent data. We discovered motifs (e.g., QPF, QPQ, PYP) that occur specifically in CD associated peptides. We also developed machine learning-based models using peptide composition and achieved a maximum AUROC of 0.99. Finally, we developed an ensemble method that combines the motif-based approach and machine learning-based models. The ensemble model predicts CD associated motifs with 100% accuracy on an independent dataset not used for training. The best models and motifs have been integrated in a web server and standalone software package, "CDpred". We hope this server will serve the scientific community in the prediction, design and scanning of CD associated peptides as well as CD associated motifs in a protein/peptide sequence (https://webs.iiitd.edu.in/raghava/cdpred/).
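
The P and Q density feature and the motif scan mentioned in this abstract are simple enough to sketch directly. The code below is an illustration and not the CDpred implementation: the function names are hypothetical, density is assumed to mean the fraction of residues that are P or Q, and the example peptide is illustrative. The three motifs (QPF, QPQ, PYP) are those quoted in the abstract.

```python
# Minimal sketch (assumption): fraction of proline/glutamine residues in a
# peptide, plus a plain substring scan for the motifs named in the abstract.
def pq_density(peptide: str) -> float:
    """Return the fraction of residues that are P or Q."""
    if not peptide:
        raise ValueError("empty peptide")
    peptide = peptide.upper()
    return sum(residue in "PQ" for residue in peptide) / len(peptide)


def find_motifs(peptide: str, motifs=("QPF", "QPQ", "PYP")) -> list[str]:
    """Return the motifs that occur anywhere in the peptide."""
    peptide = peptide.upper()
    return [motif for motif in motifs if motif in peptide]


# Usage with an illustrative peptide string:
density = pq_density("PQPQLPYPQ")   # 7 of 9 residues are P or Q
hits = find_motifs("PQPQLPYPQ")     # contains QPQ and PYP, but not QPF
```

A threshold on `pq_density` would turn it into a classifier; the abstract does not give the threshold used, so none is assumed here.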


Subjects
Celiac Disease; Humans; Epitopes; Glutens; Peptides; Amino Acids
15.
Database (Oxford) ; 2023, 2023 Feb 07.
Article in English | MEDLINE | ID: mdl-36747479

ABSTRACT

Saliva as a non-invasive diagnostic fluid has immense potential as a tool for early diagnosis and prognosis of patients. The information about salivary biomarkers is broadly scattered across various resources and research papers. It is important to bring together all the information on salivary biomarkers on a single platform. This will accelerate research and development in non-invasive diagnosis and prognosis of complex diseases. We collected widespread information on five types of salivary biomarkers found in humans: proteins, metabolites, microbes, micro-ribonucleic acid (miRNA) and genes. This information was collected from different resources, including PubMed, the Human Metabolome Database and SalivaTecDB. Our database SalivaDB contains a total of 15,821 entries for 201 different diseases and 48 disease categories. These entries can be classified into five categories based on the type of biomolecule; 6067, 3987, 2909, 2272 and 586 entries belong to proteins, metabolites, microbes, miRNAs and genes, respectively. The information maintained in this database includes analysis methods, associated diseases, biomarker type, regulation status, exosomal origin, fold change and sequence. The entries are linked to relevant biological databases to provide users with comprehensive information. We developed a web-based interface that provides a wide range of options such as browse, keyword search and advanced search. In addition, a similarity search module has been integrated that allows users to perform a similarity search using the Basic Local Alignment Search Tool and the Smith-Waterman algorithm against biomarker sequences in SalivaDB. We created a web-based database, SalivaDB, which provides information about salivary biomarkers found in humans. A wide range of web-based facilities have been integrated to provide services to the scientific community (https://webs.iiitd.edu.in/raghava/salivadb/).


Subjects
Databases, Factual; MicroRNAs; Humans; Algorithms; Biomarkers; MicroRNAs/genetics; Software; Saliva
16.
J Comput Biol ; 30(2): 204-222, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36251780

ABSTRACT

In the last three decades, a wide range of protein features have been discovered to annotate a protein. Numerous attempts have been made to integrate these features in a software package/platform so that the user may compute a wide range of features from a single source. To complement the existing methods, we developed a method, Pfeature, for computing a wide range of protein features. Pfeature can compute more than 200,000 features required for predicting the overall function of a protein, residue-level annotation of a protein, and function of chemically modified peptides. It has six major modules, namely, composition, binary profiles, evolutionary information, structural features, patterns, and model building. The composition module computes most of the existing compositional features, plus novel features. The binary profile of amino acid sequences captures the fraction of each type of residue as well as its position. The evolutionary information module computes evolutionary information of a protein in the form of a position-specific scoring matrix profile generated using Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST), suitable for annotation of a protein and its residues. A structural module was developed for computing structural features/descriptors from the tertiary structure of a protein. These features are suitable for predicting the therapeutic potential of a protein containing non-natural or chemically modified residues. The model-building module implements various machine learning techniques for developing classification and regression models as well as feature selection. Pfeature also allows the generation of overlapping patterns and features from a protein. A user-friendly Pfeature is available as a web server, Python library, and stand-alone package.


Subjects
Proteins; Software; Proteins/chemistry; Peptides; Amino Acid Sequence; Machine Learning; Databases, Protein; Sequence Analysis, Protein/methods
17.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36575815

ABSTRACT

In the current era, one of the major challenges is to manage the treatment of drug/antibiotic-resistant strains of bacteria. Phage therapy, a century-old technique, may serve as an alternative to antibiotics in treating bacterial infections caused by drug-resistant strains of bacteria. In this review, a systematic attempt has been made to summarize phage-based therapy in depth. This review has been divided into the following two sections: general information and computer-aided phage therapy (CAPT). In the case of general information, we cover the history of phage therapy, the mechanism of action, the status of phage-based products (approved and in clinical trials) and the challenges. This review emphasizes CAPT, where we cover primary phage-associated resources, phage prediction methods and pipelines. This review covers a wide range of databases and resources, including viral genomes and proteins, phage receptors, host genomes of phages, phage-host interactions and lytic proteins. In the post-genomic era, identifying the most suitable phage for lysing a drug-resistant strain of bacterium is crucial for developing alternative treatments for drug-resistant bacteria, and this remains a challenging problem. Thus, we compile all phage-associated prediction methods, including the prediction of phages for a bacterial strain, the host for a phage and the identification of interacting phage-host pairs. Most of these methods have been developed using machine learning and deep learning techniques. This review also discusses recent advances in the field of CAPT, where we briefly describe computational tools available for predicting phage virions, the life cycle of phages and prophage identification. Finally, we describe the advantages, challenges and opportunities of phage-based therapy.


Subjects
Bacterial Infections; Bacteriophages; Phage Therapy; Humans; Phage Therapy/methods; Prophages; Genomics; Bacteria/genetics; Anti-Bacterial Agents
18.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36524996

ABSTRACT

A number of antigens induce an autoimmune response against β-cells, leading to type 1 diabetes mellitus (T1DM). Recently, several antigen-specific immunotherapies have been developed to treat T1DM. Thus, identification of T1DM associated peptides with antigenic regions or epitopes is important for peptide-based therapeutics (e.g., immunotherapeutics). In this study, for the first time, an attempt has been made to develop a method for predicting, designing, and scanning T1DM associated peptides with high precision. We analysed 815 T1DM associated peptides and observed that these peptides are not associated with a specific class of HLA alleles. Thus, HLA binder prediction methods are not suitable for predicting T1DM associated peptides. First, we developed a similarity/alignment-based method using the Basic Local Alignment Search Tool and achieved a high probability of correct hits with poor coverage. Second, we developed an alignment-free method using machine learning techniques and obtained a maximum AUROC of 0.89 using dipeptide composition. Finally, we developed a hybrid method that combines the strengths of both alignment-free and alignment-based methods and achieves a maximum area under the receiver operating characteristic curve of 0.95 with a Matthews correlation coefficient of 0.81 on an independent dataset. We developed a web server, 'DMPPred', and a stand-alone version for predicting, designing and scanning T1DM associated peptides (https://webs.iiitd.edu.in/raghava/dmppred/).
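
For readers unfamiliar with the Matthews correlation coefficient (MCC) reported in this abstract, the following sketch computes it from the four cells of a binary confusion matrix using the standard formula. The function name is illustrative, and returning 0.0 when the denominator is zero is a common convention adopted here as an assumption.

```python
# Minimal sketch: Matthews correlation coefficient from confusion-matrix counts.
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention (assumption) when a row or column is empty
    return (tp * tn - fp * fn) / denominator
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and unlike accuracy it stays informative when the positive and negative classes differ greatly in size.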


Subjects
Diabetes Mellitus, Type 1; Humans; Diabetes Mellitus, Type 1/genetics; Computer Simulation; Peptides/chemistry; Epitopes/chemistry; Software
19.
Brief Bioinform ; 24(1), 2023 Jan 19.
Article in English | MEDLINE | ID: mdl-36516298

ABSTRACT

This paper describes Pprint2, an improved version of Pprint, developed for predicting RNA-interacting residues in a protein. The training and independent/validation datasets used in this study comprise 545 and 161 non-redundant RNA-binding proteins, respectively. All models were trained on the training dataset and evaluated on the validation dataset. The preliminary analysis reveals that positively charged amino acids, such as H, R and K, are more prominent among the RNA-interacting residues. Initially, machine learning-based models were developed using binary profiles and obtained a maximum area under the curve (AUC) of 0.68 on the validation dataset. The performance improved significantly, from an AUC of 0.68 to 0.76, when the evolutionary profile was used instead of the binary profile. The performance of our evolutionary profile-based model improved further, from an AUC of 0.76 to 0.82, when a convolutional neural network was used to develop the model. Our final model, based on a convolutional neural network using evolutionary information, achieved an AUC of 0.82 with a Matthews correlation coefficient of 0.49 on the validation dataset. Our best model outperforms existing methods when evaluated on the independent/validation dataset. User-friendly standalone software and a web-based server named 'Pprint2' have been developed for predicting RNA-interacting residues (https://webs.iiitd.edu.in/raghava/pprint2 and https://github.com/raghavagps/pprint2).


Subjects
Amino Acids; RNA; Binding Sites; RNA/metabolism; Software; RNA-Binding Proteins/metabolism
20.
Front Microbiol ; 13: 1042127, 2022.
Article in English | MEDLINE | ID: mdl-36452927

ABSTRACT

Sigma70 factor plays a crucial role in prokaryotes and regulates the transcription of most housekeeping genes. One of the major challenges is to predict the sigma70 promoter, or sigma70 factor binding site, with high precision. In this study, we trained and evaluated our models on a dataset consisting of 741 sigma70 promoters and 1,400 non-promoters. We generated around 8,000 features, including Dinucleotide Auto-Correlation, Dinucleotide Cross-Correlation, Dinucleotide Auto Cross-Correlation, Moran Auto-Correlation, Normalized Moreau-Broto Auto-Correlation, Parallel Correlation Pseudo Tri-Nucleotide Composition, etc. Our SVM-based model achieved a maximum accuracy of 97.38% with an AUROC of 0.99 on the training dataset, using the 200 most relevant features. To check the robustness of the model, we tested it on an independent dataset built from RegulonDB 10.8, which included 1,134 sigma70 promoters and 638 non-promoters, and achieved an accuracy of 90.41% with an AUROC of 0.95. Our model successfully predicted constitutive promoters with an accuracy of 81.46% on an independent dataset. We have developed a method, Sigma70Pred, which is available as a webserver and standalone packages at https://webs.iiitd.edu.in/raghava/sigma70pred/. The services are freely accessible.
